Large Language Models as Search Engines: Societal Challenges

Sadeddine, Zacchary, Maxwell, Winston, Varoquaux, Gaël, Suchanek, Fabian M.

arXiv.org Artificial Intelligence

Large Language Models (LLMs) may one day replace search engines as the primary portal to information on the Web. In this article, we investigate the societal challenges that such a change could bring. We focus on the roles of LLM Providers, Content Creators, and End Users, and identify 15 types of challenges. With each, we show current mitigation strategies -- both from the technical perspective and the legal perspective. We also discuss the impact of each challenge and point out future research opportunities.


Correction of Decoupled Weight Decay

Chou, Jason Chuan-Chih

arXiv.org Artificial Intelligence

Decoupled weight decay, solely responsible for the performance advantage of AdamW over Adam, has long been set proportional to the learning rate γ without question. To the contrary, we find that eliminating the contribution of the perpendicular component of the update to the weight norm leads to little change in the training dynamics. For adaptive gradient methods such as SGD with momentum (Sutskever et al., 2013) and Adam (Kingma & Ba, 2015), weight decay is no longer equivalent to L2 regularization [...] Nevertheless, Defazio (2025) presents experiments on the Llama 3 architecture (Grattafiori et al., 2024), in which most layers are not immediately followed by normalization. It states that "we consider every linear layer as normalized, excluding the output layer of the network" for the purpose of applying such corrected weight decay, and AdamC results in more stable weight and gradient norms than the AdamW baseline regardless. Consider the "Renormalized" AdamW optimizer above (Algorithm 1), which eliminates the contribution of u [...] We train a variant of ViT-S/16 based on the setup described in Beyer et al. (2022) on the ImageNet-1k dataset (Russakovsky et al., 2015) for 90 epochs and instead observe almost no differences in relevant metrics (Figure 1).
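
For readers unfamiliar with the baseline being questioned, the following is a minimal NumPy sketch of one standard AdamW step, in which the decay term is applied directly to the weights and scaled by the learning rate. It is not the paper's Algorithm 1 and does not implement AdamC or the "Renormalized" variant; the function name `adamw_step` and the default hyperparameters are illustrative assumptions.

```python
import numpy as np

def adamw_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One textbook AdamW update (illustrative; not the paper's algorithm)."""
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for step t (t starts at 1).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Decoupled decay: shrink the weights directly, proportional to lr,
    # instead of adding weight_decay * w to the gradient.
    w = w - lr * weight_decay * w
    # Adaptive gradient step.
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

With a zero gradient the adaptive step vanishes, so the weights shrink by exactly the factor `1 - lr * weight_decay`, which makes the proportionality to the learning rate explicit.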


When Tables Leak: Attacking String Memorization in LLM-Based Tabular Data Generation

Ward, Joshua, Gu, Bochao, Wang, Chi-Hua, Cheng, Guang

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have recently demonstrated remarkable performance in generating high-quality tabular synthetic data. In practice, two primary approaches have emerged for adapting LLMs to tabular data generation: (i) fine-tuning smaller models directly on tabular datasets, and (ii) prompting larger models with examples provided in context. In this work, we show that popular implementations from both regimes exhibit a tendency to compromise privacy by reproducing memorized patterns of numeric digits from their training data. To systematically analyze this risk, we introduce a simple No-box Membership Inference Attack (MIA) called LevAtt that assumes adversarial access to only the generated synthetic data and targets the string sequences of numeric digits in synthetic observations. Using this approach, our attack exposes substantial privacy leakage across a wide range of models and datasets, and in some cases, is even a perfect membership classifier on state-of-the-art models. Our findings highlight a unique privacy vulnerability of LLM-based synthetic data generation and the need for effective defenses. To this end, we propose two methods, including a novel sampling strategy that strategically perturbs digits during generation. Our evaluation demonstrates that this approach can defeat these attacks with minimal loss of fidelity and utility of the synthetic data.
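
The abstract does not spell out how LevAtt scores a record. The sketch below assumes, from the name alone, that the score is based on Levenshtein (edit) distance between digit strings: a candidate record is flattened to its digits and compared with every synthetic record, and a small minimum distance is read as evidence of membership. The helper names `digit_string` and `membership_score` are hypothetical and the actual attack may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def digit_string(record) -> str:
    """Concatenate only the digit characters of every field (assumed encoding)."""
    return "".join(ch for value in record for ch in str(value) if ch.isdigit())

def membership_score(candidate, synthetic_records) -> int:
    """Higher score = closer digit-string match in the synthetic data."""
    target = digit_string(candidate)
    return -min(levenshtein(target, digit_string(s)) for s in synthetic_records)

# Usage with made-up toy rows: only the synthetic data is needed (no-box setting).
synthetic = [(52.3, 1800), (41.7, 2651)]
print(membership_score((41.7, 2650), synthetic))
```

An attacker under this assumption would threshold the score to decide membership; the threshold itself would have to be calibrated and is not shown.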


Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training

Krajewski, Jakub, Shidani, Amitis, Busbridge, Dan, Wiseman, Sam, Ramapuram, Jason

arXiv.org Artificial Intelligence

Large Language Models (OpenAI et al., 2024; Team et al., 2025; DeepSeek-AI et al., 2025) based on the Transformer (Vaswani et al., 2023) architecture have achieved impressive results, approaching or exceeding human-level performance across multiple domains. Scaling laws (Hestness et al., 2017; Kaplan et al., 2020) are an established method for modeling the performance of these networks, enabling researchers to plan large-scale training runs based on curated sets of smaller experiments. Traditionally, these laws focus on predicting proxy metrics for model quality, such as pre-training log-perplexity. This has proven invaluable for optimizing training hyperparameters, like the optimal ratio of tokens to parameters. Another important direction in understanding the scaling of LLMs is tracking the behavior of more interpretable indicators of model capabilities, like accuracy on downstream benchmarks measuring performance on general knowledge, reasoning, math, and coding tasks. Despite early attempts to solve this problem (Grattafiori et al., 2024; Isik et al., 2025; Chen et al., 2025), the scaling of downstream metrics has often been described as noisy and unreliable (Schaeffer et al., 2025; Lourie et al., 2025). Current approaches to modeling the downstream performance of LLMs (Grattafiori et al., 2024; Chen et al., 2025; Bhagia et al., 2024) typically rely on a two-stage approach, where the training budget is first mapped to a proxy metric like the mean log-probability of the correct answer, and then another dependence is established, mapping the proxy to benchmark accuracy. Work done as an intern at Apple.


General Exploratory Bonus for Optimistic Exploration in RLHF

Li, Wendi, Oh, Changdae, Li, Sharon

arXiv.org Artificial Intelligence

Optimistic exploration is central to improving sample efficiency in reinforcement learning with human feedback, yet existing exploratory bonus methods to incentivize exploration often fail to realize optimism. We provide a theoretical analysis showing that current formulations, under KL or $α$-divergence regularization, unintentionally bias exploration toward high-probability regions of the reference model, thereby reinforcing conservative behavior instead of promoting discovery of uncertain regions. To address this pitfall, we introduce the General Exploratory Bonus (GEB), a novel theoretical framework that provably satisfies the optimism principle. GEB counteracts divergence-induced bias via reference-dependent reward regulation and unifies prior heuristic bonuses as special cases, while extending naturally across the full $α$-divergence family. Empirically, GEB consistently outperforms baselines on alignment tasks across multiple divergence settings and large language model backbones. These results demonstrate that GEB offers both a principled and practical solution for optimistic exploration in RLHF.
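
As background for the claimed bias toward the reference model, recall the standard closed-form optimum of the KL-regularized objective. This is a well-known result rather than the paper's own derivation, and the symbols (reward r, temperature β, partition function Z) are notation assumed here for illustration.

```latex
\max_{\pi}\; \mathbb{E}_{y \sim \pi(\cdot \mid x)}\bigl[r(x, y)\bigr]
  - \beta\, \mathrm{KL}\bigl(\pi(\cdot \mid x)\,\|\,\pi_{\mathrm{ref}}(\cdot \mid x)\bigr)
\quad\Longrightarrow\quad
\pi^{*}(y \mid x) = \frac{1}{Z(x)}\, \pi_{\mathrm{ref}}(y \mid x)\,
  \exp\!\left(\frac{r(x, y)}{\beta}\right),
\qquad
Z(x) = \sum_{y'} \pi_{\mathrm{ref}}(y' \mid x)\, \exp\!\left(\frac{r(x, y')}{\beta}\right).
```

Because the optimum is multiplied by the reference probability, responses that the reference model considers unlikely stay unlikely unless their reward is very large, which is consistent with the conservative behavior the abstract describes.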


Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language Models

Li, Hengzhuang, Zhang, Xinsong, Peng, Qiming, Luo, Bin, Hu, Han, Jiang, Dengyang, Ye, Han-Jia, Zhang, Teng, Jin, Hai

arXiv.org Artificial Intelligence

Multimodal Large Language Models (MLLMs) have demonstrated remarkable proficiency in multimodal tasks. Despite their impressive performance, MLLMs suffer from the modality imbalance issue, where visual information is often underutilized compared to textual representations in deeper layers, leading to degraded visual performance or hallucinations. This issue stems from the predominant reliance on next-text-token prediction during training, which fails to provide direct visual supervisory signals, resulting in progressive homogenization of visual representations throughout the layers. To this end, we propose Latent Visual Reconstruction (LaVer), a novel training framework that helps MLLMs learn more discriminative visual representations via masked image modeling in the joint latent semantic space of the LLM. Our method offers direct visual activation to MLLMs, which exhibit increased visual attention allocation, indicating enhanced utilization of visual information. Extensive experiments across diverse benchmarks demonstrate the superiority of our approach in various scenarios, especially those requiring dense visual capabilities. Code for LaVer is available at https://github.com/Fir-lat/LaVer.


Automated Data Enrichment using Confidence-Aware Fine-Grained Debate among Open-Source LLMs for Mental Health and Online Safety

Mao, Junyu, Hills, Anthony, Tseriotou, Talia, Liakata, Maria, Shamir, Aya, Sayda, Dan, Atzil-Slonim, Dana, Djohari, Natalie, Mandal, Arpan, Roth, Silke, Ugwudike, Pamela, Niranjan, Mahesan, Middleton, Stuart E.

arXiv.org Artificial Intelligence

Real-world indicators, such as life events for mental health analysis and risky behaviour for online safety, are important for improving natural language processing (NLP) tasks, yet labelling such information in NLP training datasets is often costly or difficult given the dynamic nature of such events. This paper compares several LLM-based data enrichment methods and introduces a novel Confidence-Aware Fine-Grained Debate (CFD) framework in which multiple LLM agents simulate human annotators and exchange fine-grained evidence to reach consensus. We describe two new expert-annotated datasets, a mental health Reddit wellbeing dataset and an online safety Facebook sharenting risk dataset. Our CFD framework achieves the most robust data enrichment performance compared to a range of baselines, and we show that this type of data enrichment consistently improves downstream tasks. Enriched features incorporated via debate transcripts yield the largest gains, outperforming the non-enriched baseline by 10.1% for the online safety task.


SystolicAttention: Fusing FlashAttention within a Single Systolic Array

Lin, Jiawei, Li, Yuanlong, Chen, Guokai, Bourgeat, Thomas

arXiv.org Artificial Intelligence

Transformer models rely heavily on the scaled dot-product attention (SDPA) operation, typically implemented as FlashAttention. Characterized by its frequent interleaving of matrix multiplications and softmax operations, FlashAttention fails to fully utilize the compute resources of modern systolic-array-based accelerators designed for consecutive and large matrix multiplications. To fully unleash the performance potential of systolic arrays for FlashAttention, we propose FSA, an enhanced systolic array architecture that runs the entire FlashAttention on the array without external vector units. Combined with SystolicAttention, an optimized kernel for FSA that achieves fine-grained and element-wise overlapping of FlashAttention operations, FSA maximizes array utilization while preserving the original floating-point operation order of FlashAttention. We implement FSA in synthesizable RTL and evaluate its performance against state-of-the-art systolic-array-based accelerators. Our results show that FSA achieves 1.77x and 4.83x higher attention FLOPs/s utilization compared to AWS Neuron-v2 and Google TPUv5e, respectively. We synthesize FSA in a 16 nm technology at 1.5 GHz, and results indicate only a 12% area overhead compared to a standard weight-stationary systolic array.
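
The interleaving of matrix multiplications and softmax steps mentioned above can be seen in a small NumPy sketch of the tiled, online-softmax computation that FlashAttention-style kernels use. This is a software illustration only, not FSA or the SystolicAttention kernel; the function names and the block size are assumptions.

```python
import numpy as np

def reference_attention(q, k, v):
    """Plain scaled dot-product attention, used as the ground truth."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return (weights / weights.sum(axis=-1, keepdims=True)) @ v

def tiled_attention(q, k, v, block=4):
    """Process keys/values block by block with a running (online) softmax."""
    n, d = q.shape
    running_max = np.full((n, 1), -np.inf)   # row-wise max seen so far
    running_sum = np.zeros((n, 1))           # softmax denominator so far
    acc = np.zeros((n, v.shape[-1]))         # unnormalized output so far
    for start in range(0, k.shape[0], block):
        k_blk, v_blk = k[start:start + block], v[start:start + block]
        scores = q @ k_blk.T / np.sqrt(d)                        # matmul
        new_max = np.maximum(running_max, scores.max(axis=-1, keepdims=True))
        scale = np.exp(running_max - new_max)                    # rescale old partials
        p = np.exp(scores - new_max)                             # softmax numerators
        running_sum = running_sum * scale + p.sum(axis=-1, keepdims=True)
        acc = acc * scale + p @ v_blk                            # matmul
        running_max = new_max
    return acc / running_sum

rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(5, 8)), rng.normal(size=(10, 8)), rng.normal(size=(10, 8))
print(np.allclose(tiled_attention(q, k, v), reference_attention(q, k, v)))
```

Each loop iteration alternates two matrix multiplications with element-wise max, exponential, and sum operations, which is the pattern that a matmul-only array handles poorly without extra vector units.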


Hallucination reduction with CASAL: Contrastive Activation Steering For Amortized Learning

Yang, Wannan, Qiu, Xinchi, Yu, Lei, Zhang, Yuchen, Yang, Aobo, Kokhlikyan, Narine, Cancedda, Nicola, Garcia-Olano, Diego

arXiv.org Artificial Intelligence

Large Language Models (LLMs) exhibit impressive capabilities but often hallucinate, confidently providing incorrect answers instead of admitting ignorance. Prior work has shown that models encode linear representations of their own knowledge and that activation steering can reduce hallucinations. These approaches, however, require real-time monitoring and intervention during inference. We introduce Contrastive Activation Steering for Amortized Learning (CASAL), an efficient algorithm that connects interpretability with amortized optimization. CASAL directly bakes the benefits of activation steering into the model's weights. Once trained, LLMs answer questions they know while abstaining from answering those they do not. CASAL's lightweight design requires training only a submodule of a single transformer layer and yet reduces hallucination by 30%-40% across multiple short-form QA benchmarks. CASAL is 30x more compute-efficient and 20x more data-efficient than strong LoRA-based baselines such as SFT and DPO, boosting its practical applicability in data-scarce domains. Importantly, CASAL also generalizes effectively to out-of-distribution (OOD) domains. We showcase CASAL's flexibility in mitigating hallucinations in both text-only and vision-language models. To our knowledge, CASAL is the first steering-based training method that has been shown to be effective for both dense and Mixture-of-Experts (MoE) models. CASAL represents a promising step forward in applying interpretability-inspired methods to practical deployment in production systems.
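
The inference-time technique that CASAL builds on can be sketched in a few lines. The example below shows generic contrastive activation steering, assuming the common difference-of-means construction: a direction is computed from two sets of hidden activations and added to a hidden state. It does not show CASAL's amortized training step, and the function names and toy arrays are hypothetical.

```python
import numpy as np

def steering_vector(acts_positive, acts_negative):
    """Difference of mean activations between two contrastive prompt sets."""
    return acts_positive.mean(axis=0) - acts_negative.mean(axis=0)

def steer(hidden, direction, alpha=1.0):
    """Shift a hidden state along the steering direction with strength alpha."""
    return hidden + alpha * direction

# Toy activations (rows = prompts, columns = hidden dimensions), made up for illustration.
acts_known = np.array([[1.0, 0.0], [3.0, 0.0]])
acts_unknown = np.array([[0.0, 2.0], [0.0, 4.0]])
direction = steering_vector(acts_known, acts_unknown)
print(steer(np.zeros(2), direction, alpha=0.5))
```

Applying `steer` at every forward pass is the real-time intervention the abstract refers to; per the abstract, CASAL instead trains a submodule of one layer so that no such intervention is needed at inference.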


CoGraM: Context-sensitive granular optimization method with rollback for robust model fusion

Lenz, Julius

arXiv.org Artificial Intelligence

Merging neural networks without retraining is central to federated and distributed learning. Common methods such as weight averaging or Fisher merging often lose accuracy and are unstable across seeds. CoGraM (Contextual Granular Merging) is a multi-stage, context-sensitive, loss-based, iterative optimization method that operates at the layer, neuron, and weight levels, aligns merge decisions with loss differences and thresholds, and prevents harmful updates through rollback. It addresses the weaknesses of methods such as Fisher merging and can significantly improve the merged network.
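
The idea of a loss-guided merge with rollback can be illustrated with a small sketch. The code below is a simplified, layer-level stand-in written for this listing, not CoGraM itself: it averages one layer at a time, re-evaluates a user-supplied loss, and undoes the change if the loss rises by more than a tolerance. The function name, the `alpha` and `tolerance` parameters, and the toy loss are all assumptions.

```python
import numpy as np

def merge_with_rollback(base, other, loss_fn, alpha=0.5, tolerance=0.0):
    """Greedy per-layer interpolation that reverts any update that hurts the loss."""
    merged = {name: w.copy() for name, w in base.items()}
    best_loss = loss_fn(merged)
    for name in merged:
        previous = merged[name]
        merged[name] = (1 - alpha) * previous + alpha * other[name]
        new_loss = loss_fn(merged)
        if new_loss > best_loss + tolerance:
            merged[name] = previous      # rollback: the merge made things worse
        else:
            best_loss = new_loss         # keep the merged layer
    return merged, best_loss

# Toy example: the "loss" is the squared distance to a made-up target.
target = {"a": np.array([1.0]), "b": np.array([1.0])}
loss = lambda params: float(sum(((params[n] - target[n]) ** 2).sum() for n in params))
base = {"a": np.array([0.0]), "b": np.array([0.0])}
other = {"a": np.array([2.0]), "b": np.array([-2.0])}
merged, final_loss = merge_with_rollback(base, other, loss)
print(merged, final_loss)
```

In this toy run, merging layer `a` lowers the loss and is kept, while merging layer `b` raises it and is rolled back. A finer-grained method would repeat the same accept-or-revert test at neuron and weight granularity.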